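Face presentation attack detection (PAD) is critical for securing face recognition (FR) applications. FR performance has been shown to be unfair to certain demographic and non-demographic groups. The fairness of face PAD, however, remains understudied, mainly because appropriately annotated data is lacking. To address this issue, this work first presents a Combined Attribute Annotated PAD Dataset (CAAD-PAD), built by combining several well-known PAD datasets and providing seven human-annotated attribute labels. The work then comprehensively analyses the fairness of a set of face PAD solutions, and its relation to the nature of the training data and to the Operational Decision Threshold Assignment (ODTA), by studying four face PAD approaches on CAAD-PAD. To represent both PAD fairness and absolute PAD performance at the same time, we introduce a novel metric, the Accuracy Balanced Fairness (ABF). Extensive experiments on CAAD-PAD show that the training data and the ODTA induce unfairness across gender, occlusion, and other attribute groups. Based on these analyses, we propose a data augmentation method, FairSWAP, which aims to disrupt identity/semantic information and guide models to mine attack cues rather than attribute-related information. Detailed experimental results show that FairSWAP generally improves both PAD performance and the fairness of face PAD.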
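The effect of a single operational decision threshold on different attribute groups can be illustrated with a small sketch. This is not the ABF metric or the evaluation protocol of the work above, neither of which is specified here. It only computes the two standard PAD error rates per group at one shared threshold: APCER (attacks accepted as bona fide) and BPCER (bona fide samples rejected as attacks). The function name, the score convention (higher means more likely bona fide) and the group labels are assumptions made for illustration.

```python
from collections import defaultdict


def per_group_error_rates(scores, labels, groups, threshold):
    """Return {group: (apcer, bpcer)} at one shared decision threshold.

    Assumed conventions (hypothetical, not taken from the abstract):
      scores: higher value means "more likely bona fide"
      labels: 1 for bona fide, 0 for attack
      groups: one attribute value per sample, e.g. "A" or "B"
    """
    counts = defaultdict(
        lambda: {"attack": 0, "attack_err": 0, "bona": 0, "bona_err": 0}
    )
    for score, label, group in zip(scores, labels, groups):
        c = counts[group]
        accepted = score >= threshold
        if label == 1:
            c["bona"] += 1
            c["bona_err"] += int(not accepted)  # bona fide rejected
        else:
            c["attack"] += 1
            c["attack_err"] += int(accepted)  # attack accepted

    rates = {}
    for group, c in counts.items():
        apcer = c["attack_err"] / c["attack"] if c["attack"] else float("nan")
        bpcer = c["bona_err"] / c["bona"] if c["bona"] else float("nan")
        rates[group] = (apcer, bpcer)
    return rates


# Toy data: the same threshold yields different BPCER for groups A and B.
scores = [0.9, 0.4, 0.6, 0.2, 0.8, 0.7, 0.3, 0.55]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
rates = per_group_error_rates(scores, labels, groups, threshold=0.5)
```

A gap between the per-group rates at the same threshold is one simple way to see how a threshold choice can favour one group over another.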
Body segmentation is an important step in many computer vision problems involving human images and one of the key components that affects the performance of all downstream tasks. Several prior works have approached this problem using a multi-task model that exploits correlations between different tasks to improve segmentation performance. Based on the success of such solutions, we present in this paper a novel multi-task model for human segmentation/parsing that involves three tasks, i.e., (i) keypoint-based skeleton estimation, (ii) dense pose prediction, and (iii) human-body segmentation. The main idea behind the proposed Segmentation-Pose-DensePose model (or SPD for short) is to learn a better segmentation model by sharing knowledge across different, yet related tasks. SPD is based on a shared deep neural network backbone that branches off into three task-specific model heads and is learned using a multi-task optimization objective. The performance of the model is analysed through rigorous experiments on the LIP and ATR datasets and in comparison to a recent (state-of-the-art) multi-task body-segmentation model. Comprehensive ablation studies are also presented. Our experimental results show that the proposed multi-task (segmentation) model is highly competitive and that the introduction of additional tasks contributes towards a higher overall segmentation performance.
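The abstract above does not state the exact form of the multi-task optimization objective. A common choice, assumed here purely for illustration, is a weighted sum of the per-task losses produced by the three heads. The function name, task names and weights below are hypothetical.

```python
def multi_task_loss(task_losses, task_weights=None):
    """Combine per-task scalar losses into one objective by a weighted sum.

    task_losses:  dict mapping a task name to its scalar loss value
    task_weights: dict mapping a task name to its weight (defaults to 1.0 each)

    Assumption: a weighted sum is used; the SPD abstract does not specify this.
    """
    if task_weights is None:
        task_weights = {name: 1.0 for name in task_losses}
    missing = set(task_losses) - set(task_weights)
    if missing:
        raise KeyError(f"no weight given for tasks: {sorted(missing)}")
    return sum(task_weights[name] * loss for name, loss in task_losses.items())


# Hypothetical loss values from the three task-specific heads.
losses = {"segmentation": 0.5, "pose": 0.25, "densepose": 1.0}
weights = {"segmentation": 1.0, "pose": 2.0, "densepose": 0.5}
total = multi_task_loss(losses, weights)
```

Because all heads share one backbone, gradients of this single scalar reach the shared parameters from every task, which is how the auxiliary tasks can influence segmentation quality.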
Image-based virtual try-on techniques have shown great promise for enhancing the user-experience and improving customer satisfaction on fashion-oriented e-commerce platforms. However, existing techniques are currently still limited in the quality of the try-on results they are able to produce from input images of diverse characteristics. In this work, we propose a Context-Driven Virtual Try-On Network (C-VTON) that addresses these limitations and convincingly transfers selected clothing items to the target subjects even under challenging pose configurations and in the presence of self-occlusions. At the core of the C-VTON pipeline are: (i) a geometric matching procedure that efficiently aligns the target clothing with the pose of the person in the input images, and (ii) a powerful image generator that utilizes various types of contextual information when synthesizing the final try-on result. C-VTON is evaluated in rigorous experiments on the VITON and MPV datasets and in comparison to state-of-the-art techniques from the literature. Experimental results show that the proposed approach is able to produce photo-realistic and visually convincing results and significantly improves on the existing state-of-the-art.
Recent state-of-the-art face recognition (FR) approaches have achieved impressive performance, yet unconstrained face recognition still represents an open problem. Face image quality assessment (FIQA) approaches aim to estimate the quality of the input samples that can help provide information on the confidence of the recognition decision and eventually lead to improved results in challenging scenarios. While much progress has been made in face image quality assessment in recent years, computing reliable quality scores for diverse facial images and FR models remains challenging. In this paper, we propose a novel approach to face image quality assessment, called FaceQAN, that is based on adversarial examples and relies on the analysis of adversarial noise which can be calculated with any FR model learned by using some form of gradient descent. As such, the proposed approach is the first to link image quality to adversarial attacks. Comprehensive (cross-model as well as model-specific) experiments are conducted with four benchmark datasets, i.e., LFW, CFP-FP, XQLFW and IJB-C, four FR models, i.e., CosFace, ArcFace, CurricularFace and ElasticFace, and in comparison to seven state-of-the-art FIQA methods to demonstrate the performance of FaceQAN. Experimental results show that FaceQAN achieves competitive results, while exhibiting several desirable characteristics.
Face image quality assessment (FIQA) attempts to improve face recognition (FR) performance by providing additional information about sample quality. Because FIQA methods attempt to estimate the utility of a sample for face recognition, it is reasonable to assume that these methods are heavily influenced by the underlying face recognition system. Although modern face recognition systems are known to perform well, several studies have found that such systems often exhibit problems with demographic bias. It is therefore likely that such problems are also present with FIQA techniques. To investigate the demographic biases associated with FIQA approaches, this paper presents a comprehensive study involving a variety of quality assessment methods (general-purpose image quality assessment, supervised face quality assessment, and unsupervised face quality assessment methods) and three diverse state-of-the-art FR models. Our analysis on the Balanced Faces in the Wild (BFW) dataset shows that all techniques considered are affected more by variations in race than sex. While the general-purpose image quality assessment methods appear to be less biased with respect to the two demographic factors considered, the supervised and unsupervised face image quality assessment methods both show strong bias with a tendency to favor white individuals (of either sex). In addition, we found that methods that are less racially biased perform worse overall. This suggests that the observed bias in FIQA methods is to a significant extent related to the underlying face recognition system.
In this paper, we aim to address the large domain gap between high-resolution face images, e.g., from professional portrait photography, and low-quality surveillance images, e.g., from security cameras. Establishing an identity match between disparate sources like this is a classical surveillance face identification scenario, which continues to be a challenging problem for modern face recognition techniques. To that end, we propose a method that combines face super-resolution, resolution matching, and multi-scale template accumulation to reliably recognize faces from long-range surveillance footage, including from low quality sources. The proposed approach does not require training or fine-tuning on the target dataset of real surveillance images. Extensive experiments show that our proposed method is able to outperform even existing methods fine-tuned to the SCFace dataset.
The emergence of COVID-19 has had a global and profound impact, not only on society as a whole, but also on the lives of individuals. Various prevention measures were introduced around the world to limit the transmission of the disease, including face masks, mandates for social distancing and regular disinfection in public spaces, and the use of screening applications. These developments also triggered the need for novel and improved computer vision techniques capable of (i) providing support to the prevention measures through an automated analysis of visual data, on the one hand, and (ii) facilitating normal operation of existing vision-based services, such as biometric authentication schemes, on the other. Especially important here are computer vision techniques that focus on the analysis of people and faces in visual data and have been affected the most by the partial occlusions introduced by the mandates for facial masks. Such computer vision based human analysis techniques include face and face-mask detection approaches, face recognition techniques, crowd counting solutions, age and expression estimation procedures, models for detecting face-hand interactions and many others, and have seen considerable attention over recent years. The goal of this survey is to provide an introduction to the problems induced by COVID-19 into such research and to present a comprehensive review of the work done in the computer vision based human analysis field. Particular attention is paid to the impact of facial masks on the performance of various methods and recent solutions to mitigate this problem. Additionally, a detailed review of existing datasets useful for the development and evaluation of methods for COVID-19 related applications is also provided. Finally, to help advance the field further, a discussion of the main open challenges and future research directions is given.
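We introduce a novel method for reconstructing 3D objects using a set of volumetric primitives, i.e., superquadrics. The method hierarchically decomposes a target 3D object into pairs of superquadrics, recovering progressively finer details. While such hierarchical methods have been studied before, we introduce a new way of splitting the object space that uses only the properties of the predicted superquadrics. The method is trained and evaluated on the ShapeNet dataset. Our experimental results suggest that reasonable reconstructions can be obtained with the proposed approach for a diverse set of objects with complex geometry.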
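For readers unfamiliar with the primitive, a superquadric in its canonical pose is commonly described by an inside-outside function with three size parameters and two shape exponents. The sketch below evaluates that standard function for an axis-aligned superquadric centred at the origin. It does not reproduce the hierarchical decomposition or the splitting rule of the work above, and the function and parameter names are chosen here for illustration.

```python
def superquadric_inside_outside(point, size, shape):
    """Evaluate the standard superquadric inside-outside function F.

    point: (x, y, z) in the superquadric's own coordinate frame
    size:  (a1, a2, a3), positive extents along the x, y and z axes
    shape: (e1, e2), positive shape exponents

    F < 1: point is inside, F == 1: on the surface, F > 1: outside.
    """
    x, y, z = point
    a1, a2, a3 = size
    e1, e2 = shape
    # Absolute values keep the fractional powers real for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


# With e1 = e2 = 1 and unit sizes the superquadric is the unit sphere.
unit_sphere = dict(size=(1.0, 1.0, 1.0), shape=(1.0, 1.0))
f_center = superquadric_inside_outside((0.0, 0.0, 0.0), **unit_sphere)
f_surface = superquadric_inside_outside((1.0, 0.0, 0.0), **unit_sphere)
f_outside = superquadric_inside_outside((2.0, 0.0, 0.0), **unit_sphere)
```

Smaller exponents push the shape towards a box and larger ones towards a pinched, star-like form, which is why a handful of parameters can cover a wide range of part shapes.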
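While recent face recognition (FR) systems achieve excellent results in many deployment scenarios, their performance in challenging real-world settings is still in question. Face image quality assessment (FIQA) techniques therefore aim to support FR systems by providing them with sample quality information that can be used to reject poor-quality data unsuitable for recognition purposes. Several groups of FIQA methods relying on different concepts have been proposed in the literature, all of which can be used to generate quality scores for face images that can serve as pseudo ground-truth (quality) labels and be exploited for training (regression-based) quality estimation models. Several FIQA approaches show that a significant amount of sample-quality information can be extracted from mated similarity-score distributions generated with some face matcher. Based on this insight, we propose in this paper a quality label optimization approach that incorporates sample-quality information from mated-pair similarities into the quality predictions of existing off-the-shelf FIQA techniques. We evaluate the proposed approach using three state-of-the-art FIQA methods on three diverse datasets. Our experimental results show that the proposed optimization procedure depends heavily on the number of optimization iterations executed. At ten iterations, the approach appears to perform best, consistently outperforming the base quality scores of the three FIQA methods chosen for the experiments.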
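The use of quality scores to reject poor-quality data, mentioned above, can be sketched in a few lines. This shows only the rejection step, not the label optimization procedure, whose details are not given in the abstract. The function name, the file names, the score range and the threshold are hypothetical.

```python
def reject_low_quality(samples, quality_scores, threshold):
    """Split samples into (kept, rejected) by comparing quality to a threshold.

    Assumption: a higher score means better quality, and samples whose score
    is below the threshold are withheld from the face recognition system.
    """
    kept, rejected = [], []
    for sample, quality in zip(samples, quality_scores):
        if quality >= threshold:
            kept.append(sample)
        else:
            rejected.append(sample)
    return kept, rejected


# Hypothetical file names and FIQA scores in [0, 1].
samples = ["a.png", "b.png", "c.png"]
quality_scores = [0.9, 0.2, 0.5]
kept, rejected = reject_low_quality(samples, quality_scores, threshold=0.5)
```

In practice the threshold trades off how many samples are discarded against the recognition error on the samples that remain.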
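This paper presents a summary of the Competition on Face Morphing Attack Detection Based on Privacy-aware Synthetic Training Data (SYN-MAD), held at the 2022 International Joint Conference on Biometrics (IJCB 2022). The competition attracted 12 participating teams from academia and industry, located in 11 different countries. In the end, the participating teams submitted seven valid submissions, which were evaluated by the organizers. The competition was held to present and attract solutions that detect face morphing attacks while protecting people's privacy for ethical and legal reasons. To ensure this, the training data was limited to synthetic data provided by the organizers. The submitted solutions presented innovations that led to outperforming the considered baseline in many experimental settings. The evaluation benchmark is now available at: https://github.com/marcohuber/syn-mad-2022.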